Overview

Dataset statistics

Number of variables: 15
Number of observations: 49829
Missing cells: 365211
Missing cells (%): 48.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.7 MiB
Average record size in memory: 120.0 B

Variable types

Text: 3
DateTime: 4
Numeric: 5
Categorical: 3

Alerts

ACCOUNT_AGE_MONTHS is highly overall correlated with STATE [High correlation]
AGE is highly overall correlated with STATE [High correlation]
BARCODE is highly overall correlated with GENDER and 2 other fields [High correlation]
GENDER is highly overall correlated with BARCODE and 1 other fields [High correlation]
LANGUAGE is highly overall correlated with BARCODE and 1 other fields [High correlation]
STATE is highly overall correlated with ACCOUNT_AGE_MONTHS and 4 other fields [High correlation]
LANGUAGE is highly imbalanced (72.9%) [Imbalance]
BARCODE has 5735 (11.5%) missing values [Missing]
FINAL_SALE has 12486 (25.1%) missing values [Missing]
CREATED_DATE has 49570 (99.5%) missing values [Missing]
BIRTH_DATE has 49570 (99.5%) missing values [Missing]
STATE has 49570 (99.5%) missing values [Missing]
LANGUAGE has 49570 (99.5%) missing values [Missing]
GENDER has 49570 (99.5%) missing values [Missing]
AGE has 49570 (99.5%) missing values [Missing]
ACCOUNT_AGE_MONTHS has 49570 (99.5%) missing values [Missing]
FINAL_SALE is highly skewed (γ1 = 25.10610229) [Skewed]
FINAL_QUANTITY has 12491 (25.1%) zeros [Zeros]
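
The missing-value alerts can be cross-checked against the dataset statistics. The sketch below sums the per-column missing counts from the alerts and recomputes the overall missing-cell percentage. It assumes the columns without a Missing alert have no missing values, which matches the per-variable sections of this report; the variable names in the code are illustrative only.

```python
# Row and column counts from the dataset statistics.
N_ROWS = 49829
N_COLS = 15

# Missing counts copied from the alerts; all other columns are assumed
# to have 0 missing values.
missing_per_column = {
    "BARCODE": 5735,
    "FINAL_SALE": 12486,
    "CREATED_DATE": 49570,
    "BIRTH_DATE": 49570,
    "STATE": 49570,
    "LANGUAGE": 49570,
    "GENDER": 49570,
    "AGE": 49570,
    "ACCOUNT_AGE_MONTHS": 49570,
}

total_missing = sum(missing_per_column.values())
missing_pct = 100 * total_missing / (N_ROWS * N_COLS)
# Rows that carry a value in a column with 49570 missing entries.
rows_with_user_fields = N_ROWS - missing_per_column["STATE"]

print(total_missing)           # 365211
print(f"{missing_pct:.1f}%")   # 48.9%
print(rows_with_user_fields)   # 259
```

Seven columns share the same missing count, so only 259 of the 49829 rows carry any of those fields, and those seven columns account for most of the 48.9% missing cells.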

Reproduction

Analysis started: 2025-03-11 18:29:11.080762
Analysis finished: 2025-03-11 18:30:42.081820
Duration: 1 minute and 31 seconds
Software version: ydata-profiling v0.0.dev0
Download configuration: config.json
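
As a small consistency check, the reported duration can be recomputed from the two timestamps. This is a sketch using only the Python standard library; it assumes both timestamps are in the same time zone.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2025-03-11 18:29:11.080762")
finished = datetime.fromisoformat("2025-03-11 18:30:42.081820")

elapsed = finished - started
minutes, seconds = divmod(int(elapsed.total_seconds()), 60)

print(f"{minutes} minute and {seconds} seconds")  # 1 minute and 31 seconds
```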

Variables

RECEIPT_ID
Text

Distinct: 24440
Distinct (%): 49.0%
Missing: 0
Missing (%): 0.0%
Memory size: 389.4 KiB

Length

Max length: 36
Median length: 36
Mean length: 36
Min length: 36

Characters and Unicode

Total characters: 1793844
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0000d256-4041-4a3e-adc4-5623fb6e0c99
2nd row: 0001455d-7a92-4a7b-a1d2-c747af1c8fd3
3rd row: 00017e0a-7851-42fb-bfab-0baa96e23586
4th row: 000239aa-3478-453d-801e-66a82e39c8af
5th row: 00026b4c-dfe8-49dd-b026-4c2f0fd5c6a1

Value                                 Count  Frequency (%)
0fb89572-c817-47e2-bd11-6f467baacbb2  6      < 0.1%
79151f8d-0b75-48e2-8bb4-2591bc8c9ca2  6      < 0.1%
98d68d5d-71f1-4528-a83d-cdf6d308c79b  6      < 0.1%
dd03ea1b-0fae-4bcf-bb55-d7e36eaa75b5  6      < 0.1%
a634ba37-2988-46ff-8c61-a4cc4acd4403  6      < 0.1%
4495fbcf-ad2c-4e4f-a77b-ff2ba6984f54  6      < 0.1%
d6a313ee-1aa3-4acb-a90d-f0d962ae7b8c  6      < 0.1%
171a74cd-7038-43fa-a3ae-de6b6cca5d36  6      < 0.1%
682cb059-74a1-4c47-abd8-5fd6541d88bf  6      < 0.1%
6e5ec1d0-e63f-4707-bd6e-78672ecd2a6c  6      < 0.1%
Other values (24430)                  49769  99.9%

Most occurring characters

Value             Count   Frequency (%)
-                 199316  11.1%
4                 143154  8.0%
a                 106536  5.9%
8                 106015  5.9%
9                 105788  5.9%
b                 105214  5.9%
e                 93920   5.2%
7                 93717   5.2%
2                 93593   5.2%
c                 93455   5.2%
Other values (7)  653136  36.4%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  1793844  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  1793844  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  1793844  100.0%

PURCHASE_DATE
Date

Distinct: 89
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 389.4 KiB
Minimum: 2024-06-12 00:00:00
Maximum: 2024-09-08 00:00:00
Histogram with fixed size bins (bins=50)

SCAN_DATE
Date

Distinct: 24440
Distinct (%): 49.0%
Missing: 0
Missing (%): 0.0%
Memory size: 389.4 KiB
Minimum: 2024-06-12 06:36:34.910000+00:00
Maximum: 2024-09-08 23:07:19.836000+00:00
Histogram with fixed size bins (bins=50)

STORE_NAME
Text

Distinct: 954
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 389.4 KiB

Length

Max length: 66
Median length: 42
Mean length: 8.7855867
Min length: 1

Characters and Unicode

Total characters: 437777
Distinct characters: 45
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: WALMART
2nd row: ALDI
3rd row: WALMART
4th row: FOOD LION
5th row: RANDALLS

Value                Count  Frequency (%)
walmart              21249  29.9%
dollar               4490   6.3%
store                2970   4.2%
general              2772   3.9%
aldi                 2632   3.7%
target               1484   2.1%
kroger               1477   2.1%
food                 1392   2.0%
club                 1344   1.9%
stores               1290   1.8%
Other values (1262)  29894  42.1%

Most occurring characters

Value              Count  Frequency (%)
A                  65946  15.1%
R                  48846  11.2%
L                  44337  10.1%
T                  37344  8.5%
E                  32562  7.4%
M                  27929  6.4%
W                  24863  5.7%
O                  23498  5.4%
(space)            21191  4.8%
S                  21125  4.8%
Other values (35)  90136  20.6%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  437777  100.0%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  437777  100.0%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  437777  100.0%

USER_ID
Text

Distinct: 17694
Distinct (%): 35.5%
Missing: 0
Missing (%): 0.0%
Memory size: 389.4 KiB

Length

Max length: 24
Median length: 24
Mean length: 24
Min length: 24

Characters and Unicode

Total characters: 1195896
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 63b73a7f3d310dceeabd4758
2nd row: 62c08877baa38d1a1f6c211a
3rd row: 60842f207ac8b7729e472020
4th row: 63fcd7cea4f8442c3386b589
5th row: 6193231ae9b3d75037b0f928

Value                     Count  Frequency (%)
64e62de5ca929250373e6cf5  22     < 0.1%
604278958fe03212b47e657b  20     < 0.1%
62925c1be942f00613f7365e  20     < 0.1%
64063c8880552327897186a5  18     < 0.1%
624dca0770c07012cd5e6c03  14     < 0.1%
6327a07aca87b39d76e03864  14     < 0.1%
609af341659cf474018831fb  14     < 0.1%
61d5f5d2c4525a3a478b386b  13     < 0.1%
60a5363facc00d347abadc8e  13     < 0.1%
65d4915916cc391732127174  12     < 0.1%
Other values (17684)      49669  99.7%

Most occurring characters

Value             Count   Frequency (%)
6                 107451  9.0%
5                 85632   7.2%
1                 80743   6.8%
3                 79202   6.6%
4                 76849   6.4%
0                 75962   6.4%
2                 74470   6.2%
9                 71559   6.0%
d                 70096   5.9%
c                 69530   5.8%
Other values (6)  404402  33.8%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  1195896  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  1195896  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  1195896  100.0%

BARCODE
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 11027
Distinct (%): 25.0%
Missing: 5735
Missing (%): 11.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.7157722 × 10^11
Minimum: -1
Maximum: 9.347108 × 10^12
Zeros: 0
Zeros (%): 0.0%
Negative: 8
Negative (%): < 0.1%
Memory size: 389.4 KiB

Quantile statistics

Minimum: -1
5-th percentile: 1.2000214 × 10^10
Q1: 3.0772126 × 10^10
median: 5.2100038 × 10^10
Q3: 8.5239935 × 10^10
95-th percentile: 7.873591 × 10^11
Maximum: 9.347108 × 10^12
Range: 9.347108 × 10^12
Interquartile range (IQR): 5.4467809 × 10^10

Descriptive statistics

Standard deviation: 3.2715345 × 10^11
Coefficient of variation (CV): 1.9067417
Kurtosis: 204.80754
Mean: 1.7157722 × 10^11
Median Absolute Deviation (MAD): 2.6642143 × 10^10
Skewness: 10.047094
Sum: 7.5655261 × 10^15
Variance: 1.0702938 × 10^23
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                 Count  Frequency (%)
7.874222376 × 10^10   181    0.4%
5.11111504 × 10^11    168    0.3%
5.111110018 × 10^11   163    0.3%
7.874228544 × 10^10   158    0.3%
3.111112241 × 10^11   149    0.3%
4.900000044 × 10^10   142    0.3%
7.874201228 × 10^10   142    0.3%
5.11111704 × 10^11    136    0.3%
7.874209728 × 10^10   110    0.2%
7.87420364 × 10^10    86     0.2%
Other values (11017)  42659  85.6%
(Missing)             5735   11.5%

Minimum 10 values

Value  Count  Frequency (%)
-1     8      < 0.1%
2226   4      < 0.1%
31059  12     < 0.1%
31073  24     < 0.1%
33749  8      < 0.1%
40136  6      < 0.1%
40945  42     0.1%
42185  4      < 0.1%
45605  61     0.1%
45643  4      < 0.1%

Maximum 10 values

Value                Count  Frequency (%)
9.347108002 × 10^12  2      < 0.1%
8.901696552 × 10^12  4      < 0.1%
8.711700967 × 10^12  2      < 0.1%
8.69084021 × 10^12   2      < 0.1%
7.702011027 × 10^12  2      < 0.1%
7.702011003 × 10^12  2      < 0.1%
7.501103306 × 10^12  4      < 0.1%
6.970707749 × 10^12  4      < 0.1%
5.060305162 × 10^12  2      < 0.1%
5.060242153 × 10^12  2      < 0.1%

FINAL_QUANTITY
Real number (ℝ)

ZEROS 

Distinct: 87
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.80300347
Minimum: 0
Maximum: 18
Zeros: 12491
Zeros (%): 25.1%
Negative: 0
Negative (%): 0.0%
Memory size: 389.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 1
95-th percentile: 1
Maximum: 18
Range: 18
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.6034035
Coefficient of variation (CV): 0.75143324
Kurtosis: 76.865783
Mean: 0.80300347
Median Absolute Deviation (MAD): 0
Skewness: 4.1670285
Sum: 40012.86
Variance: 0.36409579
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
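
Several of the FINAL_QUANTITY statistics are derived from one another, so they can be checked for internal consistency. The sketch below recomputes the mean, variance, coefficient of variation, and IQR from the reported sum, standard deviation, and quartiles. It uses the standard textbook relationships, not the profiler's internal code.

```python
# Values copied from the FINAL_QUANTITY section of this report.
n = 49829          # observations (FINAL_QUANTITY has no missing values)
total = 40012.86   # Sum
std = 0.6034035    # Standard deviation
q1, q3 = 0, 1      # Q1 and Q3

mean = total / n        # mean = sum / count
variance = std ** 2     # variance = std^2
cv = std / mean         # coefficient of variation = std / mean
iqr = q3 - q1           # interquartile range = Q3 - Q1

print(round(mean, 8), round(variance, 6), round(cv, 6), iqr)
```

The recomputed values agree with the reported Mean (0.80300347), Variance (0.36409579), CV (0.75143324), and IQR (1) up to rounding of the inputs.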

Common values

Value              Count  Frequency (%)
1                  35536  71.3%
0                  12491  25.1%
2                  1285   2.6%
3                  184    0.4%
4                  139    0.3%
6                  26     0.1%
5                  22     < 0.1%
8                  8      < 0.1%
12                 7      < 0.1%
7                  7      < 0.1%
Other values (77)  124    0.2%

Minimum 10 values

Value  Count  Frequency (%)
0      12491  25.1%
0.01   1      < 0.1%
0.04   1      < 0.1%
0.09   2      < 0.1%
0.23   4      < 0.1%
0.24   1      < 0.1%
0.28   1      < 0.1%
0.35   1      < 0.1%
0.46   3      < 0.1%
0.48   1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
18     2      < 0.1%
16     2      < 0.1%
12     7      < 0.1%
10     5      < 0.1%
9      3      < 0.1%
8      8      < 0.1%
7      7      < 0.1%
6.22   1      < 0.1%
6      26     0.1%
5.53   1      < 0.1%

FINAL_SALE
Real number (ℝ)

MISSING  SKEWED 

Distinct: 1434
Distinct (%): 3.8%
Missing: 12486
Missing (%): 25.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.5840664
Minimum: 0
Maximum: 462.82
Zeros: 473
Zeros (%): 0.9%
Negative: 0
Negative (%): 0.0%
Memory size: 389.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.89
Q1: 1.82
median: 3
Q3: 5.19
95-th percentile: 12.99
Maximum: 462.82
Range: 462.82
Interquartile range (IQR): 3.37

Descriptive statistics

Standard deviation: 6.6324545
Coefficient of variation (CV): 1.4468496
Kurtosis: 1419.5604
Mean: 4.5840664
Median Absolute Deviation (MAD): 1.54
Skewness: 25.106102
Sum: 171182.79
Variance: 43.989453
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
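
A skewness of about 25 means a long right tail: most sales are a few dollars, while a handful reach into the hundreds. As an illustration of how such a figure arises, the sketch below computes the adjusted Fisher-Pearson sample skewness in plain Python. Which estimator the profiler uses is an assumption here (this is the form commonly used by dataframe libraries), and the price list is hypothetical, not taken from the dataset.

```python
import math

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness G1 = sqrt(n(n-1)) / (n-2) * m3 / m2^1.5."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least 3 values")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data has no asymmetry
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1

# Hypothetical prices: mostly small amounts plus one large outlier.
hypothetical_sales = [1.25, 1.99, 2.99, 3.00, 3.99, 4.99, 462.82]
symmetric = [1, 2, 3, 4, 5]

print(sample_skewness(symmetric))           # 0.0 for symmetric data
print(sample_skewness(hypothetical_sales))  # positive: right-skewed
```

A single extreme value is enough to push the statistic well above zero, which is why the median (3) is a more representative summary of FINAL_SALE than the mean (4.58).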

Common values

Value                Count  Frequency (%)
1.25                 1313   2.6%
1                    732    1.5%
2.99                 587    1.2%
1.99                 581    1.2%
3.99                 567    1.1%
2                    534    1.1%
3.98                 506    1.0%
4.99                 484    1.0%
0                    473    0.9%
1.98                 450    0.9%
Other values (1424)  31116  62.4%
(Missing)            12486  25.1%

Minimum 10 values

Value  Count  Frequency (%)
0      473    0.9%
0.01   3      < 0.1%
0.03   2      < 0.1%
0.04   2      < 0.1%
0.05   6      < 0.1%
0.07   1      < 0.1%
0.09   2      < 0.1%
0.1    4      < 0.1%
0.12   1      < 0.1%
0.13   3      < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
462.82  2      < 0.1%
267.29  1      < 0.1%
238.17  2      < 0.1%
224.99  1      < 0.1%
139.31  1      < 0.1%
101.7   1      < 0.1%
100     1      < 0.1%
93.67   1      < 0.1%
90      2      < 0.1%
81.81   1      < 0.1%

CREATED_DATE
Date

MISSING 

Distinct: 90
Distinct (%): 34.7%
Missing: 49570
Missing (%): 99.5%
Memory size: 389.4 KiB
Minimum: 2017-07-21 19:42:14+00:00
Maximum: 2024-07-01 13:42:31+00:00
Histogram with fixed size bins (bins=50)

BIRTH_DATE
Date

MISSING 

Distinct: 90
Distinct (%): 34.7%
Missing: 49570
Missing (%): 99.5%
Memory size: 389.4 KiB
Minimum: 1943-09-03 05:00:00+00:00
Maximum: 1997-02-25 00:00:00+00:00
Histogram with fixed size bins (bins=50)

STATE
Categorical

HIGH CORRELATION  MISSING 

Distinct: 32
Distinct (%): 12.4%
Missing: 49570
Missing (%): 99.5%
Memory size: 389.4 KiB

Value              Count
FL                 34
IL                 18
PA                 18
NY                 18
NC                 14
Other values (27)  157

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 518
Distinct characters: 21
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: FL
2nd row: NY
3rd row: WI
4th row: WI
5th row: FL

Common Values

Value              Count  Frequency (%)
FL                 34     0.1%
IL                 18     < 0.1%
PA                 18     < 0.1%
NY                 18     < 0.1%
NC                 14     < 0.1%
WI                 14     < 0.1%
GA                 12     < 0.1%
CA                 12     < 0.1%
VA                 10     < 0.1%
OK                 10     < 0.1%
Other values (22)  99     0.2%
(Missing)          49570  99.5%

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
fl                 34     13.1%
il                 18     6.9%
pa                 18     6.9%
ny                 18     6.9%
nc                 14     5.4%
wi                 14     5.4%
ga                 12     4.6%
ca                 12     4.6%
va                 10     3.9%
ok                 10     3.9%
Other values (22)  99     38.2%

Most occurring characters

Value              Count  Frequency (%)
A                  68     13.1%
L                  60     11.6%
N                  54     10.4%
C                  45     8.7%
I                  42     8.1%
F                  34     6.6%
O                  26     5.0%
Y                  26     5.0%
W                  24     4.6%
T                  21     4.1%
Other values (11)  118    22.8%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  518    100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  518    100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  518    100.0%

LANGUAGE
Categorical

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct: 2
Distinct (%): 0.8%
Missing: 49570
Missing (%): 99.5%
Memory size: 389.4 KiB

Value   Count
en      247
es-419  12

Length

Max length: 6
Median length: 2
Mean length: 2.1853282
Min length: 2

Characters and Unicode

Total characters: 566
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: en
2nd row: en
3rd row: en
4th row: en
5th row: en

Common Values

Value      Count  Frequency (%)
en         247    0.5%
es-419     12     < 0.1%
(Missing)  49570  99.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
en      247    95.4%
es-419  12     4.6%
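
The 72.9% imbalance figure in the alerts can be reproduced from these two counts with an entropy-based score: one minus the Shannon entropy of the class distribution divided by its maximum possible value, log2 of the number of classes. That this is the exact definition used by the profiler is an assumption; the sketch only shows that it matches the reported number.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a perfectly balanced column, near 1 when one class dominates."""
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Non-missing LANGUAGE counts from the table above.
language_counts = {"en": 247, "es-419": 12}
score = imbalance_score(list(language_counts.values()))

print(f"{score:.1%}")  # 72.9%
```

The score is computed over the 259 non-missing rows only, so it says nothing about the 99.5% of rows where LANGUAGE is absent.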

Most occurring characters

Value  Count  Frequency (%)
e      259    45.8%
n      247    43.6%
s      12     2.1%
-      12     2.1%
4      12     2.1%
1      12     2.1%
9      12     2.1%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  566    100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  566    100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  566    100.0%

GENDER
Categorical

HIGH CORRELATION  MISSING 

Distinct: 2
Distinct (%): 0.8%
Missing: 49570
Missing (%): 99.5%
Memory size: 389.4 KiB

Value   Count
female  215
male    44

Length

Max length: 6
Median length: 6
Mean length: 5.6602317
Min length: 4

Characters and Unicode

Total characters: 1466
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: female
2nd row: male
3rd row: female
4th row: female
5th row: male

Common Values

Value      Count  Frequency (%)
female     215    0.4%
male       44     0.1%
(Missing)  49570  99.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
female  215    83.0%
male    44     17.0%

Most occurring characters

Value  Count  Frequency (%)
e      474    32.3%
m      259    17.7%
a      259    17.7%
l      259    17.7%
f      215    14.7%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  1466   100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  1466   100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  1466   100.0%

AGE
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 41
Distinct (%): 15.8%
Missing: 49570
Missing (%): 99.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 52.108108
Minimum: 28
Maximum: 81
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 389.4 KiB

Quantile statistics

Minimum: 28
5-th percentile: 32
Q1: 40
median: 51
Q3: 62.5
95-th percentile: 76
Maximum: 81
Range: 53
Interquartile range (IQR): 22.5

Descriptive statistics

Standard deviation: 14.118405
Coefficient of variation (CV): 0.27094449
Kurtosis: -1.0257722
Mean: 52.108108
Median Absolute Deviation (MAD): 11
Skewness: 0.19745528
Sum: 13496
Variance: 199.32935
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=41)

Common values

Value              Count  Frequency (%)
70                 13     < 0.1%
60                 12     < 0.1%
37                 12     < 0.1%
35                 12     < 0.1%
49                 12     < 0.1%
44                 10     < 0.1%
36                 10     < 0.1%
46                 10     < 0.1%
40                 10     < 0.1%
43                 10     < 0.1%
Other values (31)  148    0.3%
(Missing)          49570  99.5%

Minimum 10 values

Value  Count  Frequency (%)
28     8      < 0.1%
31     4      < 0.1%
32     2      < 0.1%
33     4      < 0.1%
34     6      < 0.1%
35     12     < 0.1%
36     10     < 0.1%
37     12     < 0.1%
38     2      < 0.1%
40     10     < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
81     4      < 0.1%
80     2      < 0.1%
76     8      < 0.1%
75     4      < 0.1%
73     8      < 0.1%
71     8      < 0.1%
70     13     < 0.1%
69     2      < 0.1%
67     2      < 0.1%
66     4      < 0.1%

ACCOUNT_AGE_MONTHS
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 49
Distinct (%): 18.9%
Missing: 49570
Missing (%): 99.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.752896
Minimum: 8
Maximum: 91
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 389.4 KiB

Quantile statistics

Minimum: 8
5-th percentile: 9
Q1: 22
median: 32
Q3: 51.5
95-th percentile: 74
Maximum: 91
Range: 83
Interquartile range (IQR): 29.5

Descriptive statistics

Standard deviation: 19.712683
Coefficient of variation (CV): 0.53635727
Kurtosis: -0.24103239
Mean: 36.752896
Median Absolute Deviation (MAD): 13
Skewness: 0.63860915
Sum: 9519
Variance: 388.58987
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=49)

Common values

Value              Count  Frequency (%)
32                 16     < 0.1%
29                 14     < 0.1%
25                 12     < 0.1%
53                 11     < 0.1%
30                 10     < 0.1%
55                 10     < 0.1%
43                 10     < 0.1%
51                 8      < 0.1%
39                 8      < 0.1%
33                 8      < 0.1%
Other values (39)  152    0.3%
(Missing)          49570  99.5%

Minimum 10 values

Value  Count  Frequency (%)
8      8      < 0.1%
9      8      < 0.1%
10     6      < 0.1%
11     6      < 0.1%
13     2      < 0.1%
16     6      < 0.1%
17     6      < 0.1%
18     6      < 0.1%
19     6      < 0.1%
20     2      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
91     2      < 0.1%
90     2      < 0.1%
80     4      < 0.1%
77     2      < 0.1%
74     6      < 0.1%
72     4      < 0.1%
71     4      < 0.1%
68     2      < 0.1%
64     4      < 0.1%
62     2      < 0.1%

Interactions

[25 interaction plots]

Correlations

                    ACCOUNT_AGE_MONTHS  AGE     BARCODE  FINAL_QUANTITY  FINAL_SALE  GENDER  LANGUAGE  STATE
ACCOUNT_AGE_MONTHS  1.000               0.082   0.109    0.016           0.047       0.263   0.115     0.502
AGE                 0.082               1.000   -0.013   -0.005          -0.142      0.307   0.264     0.521
BARCODE             0.109               -0.013  1.000    -0.005          0.055       0.709   0.743     0.709
FINAL_QUANTITY      0.016               -0.005  -0.005   1.000           0.021       0.000   0.186     0.319
FINAL_SALE          0.047               -0.142  0.055    0.021           1.000       0.000   0.000     0.000
GENDER              0.263               0.307   0.709    0.000           0.000       1.000   0.042     0.523
LANGUAGE            0.115               0.264   0.743    0.186           0.000       0.042   1.000     0.675
STATE               0.502               0.521   0.709    0.319           0.000       0.523   0.675     1.000
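
The High correlation alerts in the Overview can be reproduced from this matrix by listing, for each variable, the other variables whose absolute coefficient reaches a threshold. The sketch below uses 0.5, an assumed cut-off chosen because it reproduces the alerts shown in this report; the helper name is illustrative.

```python
THRESHOLD = 0.5  # assumed cut-off; it reproduces the alerts in this report

names = ["ACCOUNT_AGE_MONTHS", "AGE", "BARCODE", "FINAL_QUANTITY",
         "FINAL_SALE", "GENDER", "LANGUAGE", "STATE"]

# Correlation matrix copied from the table above (rows follow `names`).
matrix = [
    [1.000, 0.082, 0.109, 0.016, 0.047, 0.263, 0.115, 0.502],
    [0.082, 1.000, -0.013, -0.005, -0.142, 0.307, 0.264, 0.521],
    [0.109, -0.013, 1.000, -0.005, 0.055, 0.709, 0.743, 0.709],
    [0.016, -0.005, -0.005, 1.000, 0.021, 0.000, 0.186, 0.319],
    [0.047, -0.142, 0.055, 0.021, 1.000, 0.000, 0.000, 0.000],
    [0.263, 0.307, 0.709, 0.000, 0.000, 1.000, 0.042, 0.523],
    [0.115, 0.264, 0.743, 0.186, 0.000, 0.042, 1.000, 0.675],
    [0.502, 0.521, 0.709, 0.319, 0.000, 0.523, 0.675, 1.000],
]

def high_correlations(names, matrix, threshold):
    """Map each variable to the other variables with |coefficient| >= threshold."""
    flagged = {}
    for i, name in enumerate(names):
        partners = [names[j] for j, value in enumerate(matrix[i])
                    if j != i and abs(value) >= threshold]
        if partners:
            flagged[name] = partners
    return flagged

flagged = high_correlations(names, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(f"{name}: {', '.join(partners)}")
```

Every flagged pair involves STATE, GENDER, or LANGUAGE, columns that are 99.5% missing, so each of these coefficients rests on at most 259 rows and should be read with caution.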

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
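
Nullity correlation can be illustrated as the Pearson correlation between the "is missing" indicators of two columns. The sketch below is a plain-Python version under that assumption; the toy columns are hypothetical and only mimic the pattern in this report, where seven user columns share the same missing count (49570).

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined when a column is never or always missing
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy columns: STATE and GENDER are missing on the same rows,
# FINAL_SALE is missing on different rows.
state = ["FL", None, None, "NY", None]
gender = ["female", None, None, "male", None]
final_sale = [1.49, None, 3.49, None, 2.29]

print(nullity_correlation(state, gender))      # close to 1: missing together
print(nullity_correlation(state, final_sale))  # negative in this toy example
```

A value near 1 means two columns tend to be missing on the same rows. Identical missing counts across columns are consistent with that pattern, but they do not prove it on their own.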

Sample

First rows

       RECEIPT_ID                            PURCHASE_DATE  SCAN_DATE                  STORE_NAME        USER_ID                   BARCODE         FINAL_QUANTITY  FINAL_SALE  CREATED_DATE  BIRTH_DATE  STATE  LANGUAGE  GENDER  AGE  ACCOUNT_AGE_MONTHS
0      0000d256-4041-4a3e-adc4-5623fb6e0c99  2024-08-21     2024-08-21 14:19:06.539 Z  WALMART           63b73a7f3d310dceeabd4758  15300014978.0   1.0             NaN         NaT           NaT         NaN    NaN       NaN     NaN  NaN
1      0001455d-7a92-4a7b-a1d2-c747af1c8fd3  2024-07-20     2024-07-20 09:50:24.206 Z  ALDI              62c08877baa38d1a1f6c211a  nan             0.0             1.49        NaT           NaT         NaN    NaN       NaN     NaN  NaN
2      00017e0a-7851-42fb-bfab-0baa96e23586  2024-08-18     2024-08-19 15:38:56.813 Z  WALMART           60842f207ac8b7729e472020  78742229751.0   1.0             NaN         NaT           NaT         NaN    NaN       NaN     NaN  NaN
3      000239aa-3478-453d-801e-66a82e39c8af  2024-06-18     2024-06-19 11:03:37.468 Z  FOOD LION         63fcd7cea4f8442c3386b589  783399746536.0  0.0             3.49        NaT           NaT         NaN    NaN       NaN     NaN  NaN
4      00026b4c-dfe8-49dd-b026-4c2f0fd5c6a1  2024-07-04     2024-07-05 15:56:43.549 Z  RANDALLS          6193231ae9b3d75037b0f928  47900501183.0   1.0             NaN         NaT           NaT         NaN    NaN       NaN     NaN  NaN
5      0002d8cd-1701-4cdd-a524-b70402e2dbc0  2024-06-24     2024-06-24 19:44:54.247 Z  WALMART           5dcc6c510040a012b8e76924  681131411295.0  0.0             1.46        NaT           NaT         NaN    NaN       NaN     NaN  NaN
6      000550b2-1480-4c07-950f-ff601f242152  2024-07-06     2024-07-06 19:27:48.586 Z  WALMART           5f850bc9cf9431165f3ac175  49200905548.0   1.0             NaN         NaT           NaT         NaN    NaN       NaN     NaN  NaN
7      00096c49-8b04-42f9-88ce-941c5e06c4a7  2024-08-19     2024-08-21 17:35:21.902 Z  TARGET            6144f4f1f3ef696919f54b5c  78300069942.0   0.0             3.59        NaT           NaT         NaN    NaN       NaN     NaN  NaN
8      000e1d35-15e5-46c6-b6b3-33653ed3d27e  2024-08-13     2024-08-13 18:21:07.931 Z  WALMART           61a6d926f998e47aad33db66  52000011227.0   1.0             NaN         NaT           NaT         NaN    NaN       NaN     NaN  NaN
9      0010d87d-1ad2-4e5e-9a25-cec736919d15  2024-08-04     2024-08-04 18:01:47.787 Z  ALDI              66686fc2e04f743a096ea808  nan             0.0             2.29        NaT           NaT         NaN    NaN       NaN     NaN  NaN

Last rows

       RECEIPT_ID                            PURCHASE_DATE  SCAN_DATE                  STORE_NAME        USER_ID                   BARCODE         FINAL_QUANTITY  FINAL_SALE  CREATED_DATE  BIRTH_DATE  STATE  LANGUAGE  GENDER  AGE  ACCOUNT_AGE_MONTHS
49819  441b9ecd-38ed-4960-9780-eb44a464284a  2024-06-26     2024-07-02 09:37:07.656 Z  FRY'S FOOD STORE  6251c788e3d6762c55855c1d  72250021081.0   1.0             2.49        NaT           NaT         NaN    NaN       NaN     NaN  NaN
49820  840c30ae-bc0a-40a4-a47d-052ed0af2da2  2024-08-18     2024-08-18 14:44:02.530 Z  COSTCO            65b322787050d0a6206b3479  14074349.0      1.0             11.99       NaT           NaT         NaN    NaN       NaN     NaN  NaN
49821  68f74fb3-ccf2-41f3-896a-799eb9a80680  2024-08-13     2024-08-19 11:06:59.023 Z  PEPPERIDGE FARM   64f4aee2b84ba41db3fb246a  14100071198.0   1.0             2.89        NaT           NaT         NaN    NaN       NaN     NaN  NaN
49822  f6d3e61d-488d-448b-8148-8d681e55b3d2  2024-09-01     2024-09-06 08:03:54.617 Z  TARGET            61056fcc1efef449f0f39f7c  85239042663.0   1.0             3.46        NaT           NaT         NaN    NaN       NaN     NaN  NaN
49823  6cdf3c1a-78b3-4fb0-85fd-52e2f5b4731c  2024-06-26     2024-07-01 11:00:39.769 Z  HARRIS TEETER     5de7ec93ca63cc17893cdd14  nan             1.0             3.00        NaT           NaT         NaN    NaN       NaN     NaN  NaN
49824  b5cd61a9-8033-4913-a5c4-fb3f65e3a321  2024-08-21     2024-08-31 14:13:08.634 Z  TARGET            6154bcf098f885648de2f299  85239110669.0   2.0             1.18        NaT           NaT         NaN    NaN       NaN     NaN  NaN
49825  e1b2f634-c9ad-4152-b662-4b22efc25862  2024-08-11     2024-08-11 18:15:56.736 Z  STOP & SHOP       60aa809f188b926b2244c974  46100400555.0   1.0             2.00        NaT           NaT         NaN    NaN       NaN     NaN  NaN
49826  b07ef8dd-e444-40a2-819b-f74a3e5f1ae7  2024-07-11     2024-07-11 08:03:25.816 Z  WALMART           60bd26e83dc3b13a15c5f4e7  646630019670.0  1.0             20.96       NaT           NaT         NaN    NaN       NaN     NaN  NaN
49827  42475141-bef4-4df2-aa37-72577e2512bb  2024-06-18     2024-06-18 19:57:32.211 Z  MARKET BASKET     6169912fac47744405af62b7  41800501519.0   1.0             3.00        NaT           NaT         NaN    NaN       NaN     NaN  NaN
49828  3a179c4e-46f2-4126-b3d2-3514afc23a3e  2024-08-07     2024-08-07 15:30:07.911 Z  WALMART           64e94d64ca929250373ef6e1  307660745853.0  1.0             5.48        NaT           NaT         NaN    NaN       NaN     NaN  NaN